Kleene algebra with tests (KAT) is an equational system for program verification that combines
Boolean algebra (BA) with Kleene algebra (KA), the algebra of regular expressions. In
particular, KAT subsumes the propositional fragment of Hoare logic (PHL), a formal system
for the specification and verification of programs that currently underlies most tools for
checking program correctness. Both the equational theory of KAT and the encoding of PHL in KAT
are known to be decidable. In this paper we present a new decision procedure for the equivalence
of two KAT expressions based on the notion of partial derivatives. We also introduce the notion
of derivative modulo particular sets of equations, and use it to extend the previous procedure to
decide PHL. Some experimental results are also presented.

1 Introduction 

Kleene algebra with tests (KAT) is an equational algebraic system for reasoning about programs that
combines Kleene algebra (KA) with Boolean algebra [18]. In particular, KAT subsumes PHL [15], the
propositional fragment of Hoare logic, which is a formal system for the specification and verification of
programs and currently underlies most tools for checking program correctness [11]. Testing
whether two KAT expressions are equivalent is tantamount to proving that two programs are equivalent or that
a Hoare triple is valid. Deciding the equivalence of KAT expressions is as hard as deciding the equivalence of
regular expressions (KA expressions), i.e. PSPACE-complete [8]. In spite of KAT's success in dealing
with several software verification tasks, there are very few software applications that implement KAT's
equational theory and/or provide adequate decision procedures. Most of them are within (interactive)
theorem provers or part of model checking systems, see flXl [T2J ISl for some examples. 

Based on a rewrite system of Antimirov and Mosses 0, Almeida et al. developed an algorithm 
that decides regular expression equivalence through an iterated process of testing the equivalence of 
their derivatives, without resorting to the classical method of comparing minimal automata. Statistically
significant experimental tests showed that this method is, on average and under a uniform distribution,
more efficient than the classical methods based on automata [2]. Another advantage of this method is that
it is easily adapted to other Kleene algebras, such as KAT. In this paper we present an extension of that
decision algorithm to test equivalence in KAT. The termination and correctness of the algorithm follow
the lines of [3], but are also close to the coalgebraic approach to KAT presented by Kozen [17]. Deciding
PHL can be reduced to testing the equivalence of KAT expressions [15]. Here we present an alternative method
that extends the notion of derivative modulo a set of (atomic equational) assumptions. Once again,
the decision procedure has to be adapted only slightly. The new method reduces the size of the KAT
expressions to be compared, at the cost of a preprocessing phase. All the procedures were implemented
in OCaml and some experimental results are also presented.

2 Preliminaries 

We briefly review some basic definitions about regular expressions, Kleene algebras, Kleene algebras
with tests (KAT), and KAT expressions. For more details, we refer the reader to [13, 14, 18, 16, 8].

2.1 Kleene Algebra and Regular Expressions 

Let $\Sigma = \{p_1, \ldots, p_k\}$, with $k \geq 1$, be an alphabet. A word $w$ over $\Sigma$ is any finite sequence of letters. The
empty word is denoted by $1$. Let $\Sigma^*$ be the set of all words over $\Sigma$. A language over $\Sigma$ is a subset of $\Sigma^*$.
The left quotient of a language $L \subseteq \Sigma^*$ by a word $w \in \Sigma^*$ is the language $w^{-1}L = \{x \in \Sigma^* \mid wx \in L\}$. The
set $R_\Sigma$ of regular expressions over $\Sigma$ is defined by:

\[ r := 0 \mid 1 \mid p \in \Sigma \mid (r_1 + r_2) \mid (r_1 \cdot r_2) \mid r^* \qquad (1) \]

where the operator $\cdot$ (concatenation) is often omitted. The language $\mathcal{L}(r)$ associated to $r$ is inductively
defined as follows: $\mathcal{L}(0) = \emptyset$, $\mathcal{L}(1) = \{1\}$, $\mathcal{L}(p) = \{p\}$ for $p \in \Sigma$, $\mathcal{L}(r_1 + r_2) = \mathcal{L}(r_1) \cup \mathcal{L}(r_2)$,
$\mathcal{L}(r_1 \cdot r_2) = \mathcal{L}(r_1) \cdot \mathcal{L}(r_2)$, and $\mathcal{L}(r^*) = \mathcal{L}(r)^*$. Two regular expressions $r_1$ and $r_2$ are equivalent if
$\mathcal{L}(r_1) = \mathcal{L}(r_2)$, and we write $r_1 = r_2$. With this interpretation, the algebraic structure $(R_\Sigma, +, \cdot, 0, 1)$
constitutes an idempotent semiring, and with the unary operator ${}^*$, a Kleene algebra.

A Kleene algebra is an algebraic structure $\mathcal{K} = (K, +, \cdot, {}^*, 0, 1)$ satisfying the axioms below.

\[
\begin{array}{ll}
r_1 + (r_2 + r_3) = (r_1 + r_2) + r_3 & (2) \\
r_1 + r_2 = r_2 + r_1 & (3) \\
r + 0 = r + r = r & (4) \\
r_1 (r_2 r_3) = (r_1 r_2) r_3 & (5) \\
1 r = r 1 = r & (6) \\
r_1 (r_2 + r_3) = r_1 r_2 + r_1 r_3 & (7) \\
(r_1 + r_2) r_3 = r_1 r_3 + r_2 r_3 & (8) \\
0 r = r 0 = 0 & (9) \\
1 + r r^* \leq r^* & (10) \\
1 + r^* r \leq r^* & (11) \\
r_1 + r_2 r_3 \leq r_3 \to r_2^* r_1 \leq r_3 & (12) \\
r_1 + r_2 r_3 \leq r_2 \to r_1 r_3^* \leq r_2 & (13)
\end{array}
\]


In the above, $\leq$ is defined by $r_1 \leq r_2$ if and only if $r_1 + r_2 = r_2$. The axioms say that the structure
is an idempotent semiring under $+$, $\cdot$, $0$, and $1$, and that ${}^*$ behaves like the Kleene star operator of formal
language theory. This axiom set (with the usual first-order deduction system) constitutes a complete proof
system for equivalence between regular expressions [13].
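
As a small worked illustration of how the order $\leq$ interacts with the axioms (this derivation is added here as an example and is not part of the original development), one can check that $1 \leq r^*$ for every $r$, using only the definition of $\leq$ together with axioms (2), (4), and (10):

```latex
% Axiom (10) unfolds, by the definition of \leq, to (1 + r r^*) + r^* = r^*.
\begin{align*}
1 + r^* &= 1 + \bigl((1 + r r^*) + r^*\bigr) && \text{by (10)} \\
        &= (1 + 1) + (r r^* + r^*)           && \text{by (2)} \\
        &= 1 + (r r^* + r^*)                 && \text{by (4)} \\
        &= (1 + r r^*) + r^*                 && \text{by (2)} \\
        &= r^*                               && \text{by (10)}
\end{align*}
% Hence 1 + r^* = r^*, which is exactly the statement 1 \leq r^*.
```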
2.2 Kleene Algebra with Tests and KAT Expressions

A Kleene algebra with tests (KAT) is a Kleene algebra with an embedded Boolean subalgebra, $\mathcal{K} =
(K, B, +, \cdot, {}^*, 0, 1, \bar{\ })$, where $\bar{\ }$ is a unary operator denoting negation and is defined only on $B$, such that

• $(K, +, \cdot, {}^*, 0, 1)$ is a Kleene algebra;

• $(B, +, \cdot, \bar{\ }, 0, 1)$ is a Boolean algebra;

• $(B, +, \cdot, 0, 1)$ is a subalgebra of $(K, +, \cdot, 0, 1)$.

Thus, a KAT is an algebraic structure that satisfies the KA axioms (2)-(13) and the axioms for a Boolean
algebra $B$.

Let $\Sigma = \{p_1, \ldots, p_k\}$ be a non-empty set of (primitive) action symbols and $T = \{t_1, \ldots, t_l\}$ be a
non-empty set of (primitive) test symbols. The set of boolean expressions over $T$ is denoted by Bexp and the
set of KAT expressions by Exp, with elements $b_1, b_2, \ldots$ and $e_1, e_2, \ldots$, respectively. The abstract syntax
of KAT expressions over an alphabet $\Sigma \cup T$ is given by the following grammar,

\[
\begin{array}{l}
b \in \mathrm{Bexp} := 0 \mid 1 \mid t \in T \mid \bar{b} \mid b_1 + b_2 \mid b_1 \cdot b_2 \\
e \in \mathrm{Exp} := p \in \Sigma \mid b \in \mathrm{Bexp} \mid e_1 + e_2 \mid e_1 \cdot e_2 \mid e_1^*.
\end{array}
\]

As usual, we often omit the operator $\cdot$ in concatenations and in conjunctions. The standard
language-theoretic models of KAT are regular sets of guarded strings over alphabets $\Sigma$ and $T$ [16]. Let
$\bar{T} = \{\bar{t} \mid t \in T\}$ and let $At$ be the set of atoms, i.e., of all truth assignments to $T$,

\[ At = \{ b_1 \cdots b_l \mid b_i \text{ is either } t_i \text{ or } \bar{t}_i \text{ for } 1 \leq i \leq l \text{ and } t_i \in T \}. \]

Then the set of guarded strings over $\Sigma$ and $T$ is $GS = (At \cdot \Sigma)^* \cdot At$. Guarded strings will be denoted by
$x, y, \ldots$. For $x = \alpha_1 p_1 \alpha_2 p_2 \cdots p_{n-1} \alpha_n \in GS$, where $n \geq 1$, $\alpha_i \in At$ and $p_i \in \Sigma$, we define $\mathrm{first}(x) = \alpha_1$
and $\mathrm{last}(x) = \alpha_n$. If $\mathrm{last}(x) = \mathrm{first}(y)$, then the fusion product $xy$ is defined by concatenating $x$ and $y$,
omitting the extra occurrence of the common atom. If $\mathrm{last}(x) \neq \mathrm{first}(y)$, then $xy$ does not exist. For sets
$X, Y \subseteq GS$ of guarded strings, $X \diamond Y$ denotes the set of all $xy$ such that $x \in X$ and $y \in Y$. We have
that $X^0 = At$ and $X^{n+1} = X \diamond X^n$, for $n \geq 0$.
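
The definitions of atoms and of the fusion product can be made concrete with a short Python sketch. It is only an illustration under assumptions of our own: an atom is represented as the frozenset of tests assigned true, a guarded string as a tuple alternating atoms and action symbols, and the names `atoms`, `fusion`, `fuse_sets`, and `power` are hypothetical helpers, not taken from the paper's OCaml implementation.

```python
from itertools import product

def atoms(tests):
    """All truth assignments to `tests`; an atom is the frozenset of true tests."""
    names = sorted(tests)
    return [frozenset(t for t, v in zip(names, bits) if v)
            for bits in product([False, True], repeat=len(names))]

def fusion(x, y):
    """Fusion product of two guarded strings, or None when it does not exist.

    A guarded string is a tuple (a1, p1, a2, ..., an) that starts and ends
    with an atom.
    """
    if x[-1] != y[0]:        # last(x) != first(y): xy is undefined
        return None
    return x + y[1:]         # keep a single copy of the common atom

def fuse_sets(X, Y):
    """X <> Y: every defined fusion product xy with x in X and y in Y."""
    return {x + y[1:] for x in X for y in Y if x[-1] == y[0]}

def power(X, n, all_atoms):
    """X^0 = At (one-atom guarded strings) and X^(n+1) = X <> X^n."""
    result = {(a,) for a in all_atoms}
    for _ in range(n):
        result = fuse_sets(X, result)
    return result
```

For two tests there are four atoms, and fusing `(a, "p", b)` with `(b, "q", a)` yields `(a, "p", b, "q", a)`, whereas fusing strings whose boundary atoms differ yields `None`.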

Every KAT expression $e \in \mathrm{Exp}$ denotes a set of guarded strings, $GS(e) \subseteq GS$. Given a KAT
expression $e$ we define $GS(e)$ inductively as follows,

\[
\begin{array}{lll}
GS(p) & = \{ \alpha_1 p \alpha_2 \mid \alpha_1, \alpha_2 \in At \} & p \in \Sigma \\
GS(b) & = \{ \alpha \in At \mid \alpha \leq b \} & b \in \mathrm{Bexp} \\
GS(e_1 + e_2) & = GS(e_1) \cup GS(e_2) & \\
GS(e_1 e_2) & = GS(e_1) \diamond GS(e_2) & \\
GS(e^*) & = \bigcup_{n \geq 0} GS(e)^n. &
\end{array}
\]

We say that two KAT expressions $e_1$ and $e_2$ are equivalent, and write $e_1 = e_2$, if and only if $GS(e_1) =
GS(e_2)$. Kozen [18] showed that one has $e_1 = e_2$ modulo the KAT axioms if and only if $e_1 = e_2$ is true
in the free Kleene algebra with tests on generators $\Sigma \cup T$. Two sets of KAT expressions $E, F \subseteq \mathrm{Exp}$ are
equivalent if and only if $GS(E) = GS(F)$, where $GS(E) = \bigcup_{e \in E} GS(e)$.

3 Deciding Equivalence in KAT 

In this section we present a decision algorithm to test equivalence in KAT. Kozen [17] presented a
coalgebraic theory for KAT extending Rutten's coalgebraic approach for KA [20], and improving the
framework of Chen and Pucella [7]. Extending the notion of Brzozowski derivatives to KAT, Kozen proved
the existence of a coinductive equivalence procedure. Our approach follows that work closely, but we
explicitly define the notion of partial derivatives for KAT, and we effectively provide an (inductive)
decision procedure. This decision procedure is an extension of the algorithm for deciding equivalence of
regular expressions given in 121 |5l, that does not use the axiomatic system. Equivalence of expressions is 
decided through an iterated process of testing the equivalence of their partial derivatives. 

3.1 Derivatives 

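Given a set of guarded strings $R$, its derivative with respect to $\alpha p \in At \cdot \Sigma$, denoted by $D_{\alpha p}(R)$, is defined
as the left quotient of $R$ by $\alpha p$. As such, one considers the following derivative functions,

\[ D : At \cdot \Sigma \to \mathcal{P}(GS) \to \mathcal{P}(GS) \qquad E : At \to \mathcal{P}(GS) \to \{0, 1\} \]

consisting of components,

\[ D_{\alpha p} : \mathcal{P}(GS) \to \mathcal{P}(GS) \qquad E_\alpha : \mathcal{P}(GS) \to \{0, 1\} \]

defined as follows. For $\alpha \in At$, $p \in \Sigma$ and $R \subseteq GS$,

\[ D_{\alpha p}(R) = \{ y \in GS \mid \alpha p y \in R \} \quad \text{and} \quad E_\alpha(R) = \begin{cases} 1 & \text{if } \alpha \in R \\ 0 & \text{if } \alpha \notin R. \end{cases} \]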

3.2 Partial Derivatives 

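The notion of set of partial derivatives, cf. [4, 19], corresponds to a finite set representation of the
derivatives of an expression. Given $\alpha \in At$, $p \in \Sigma$ and $e \in \mathrm{Exp}$, the set $\Delta_{\alpha p}(e)$ of partial derivatives of $e$
with respect to $\alpha p$ is inductively defined as follows,

\[
\begin{array}{l}
\Delta : At \cdot \Sigma \to \mathrm{Exp} \to \mathcal{P}(\mathrm{Exp}) \\[4pt]
\Delta_{\alpha p}(p') = \begin{cases} \{1\} & \text{if } p = p' \\ \emptyset & \text{otherwise} \end{cases} \\[4pt]
\Delta_{\alpha p}(b) = \emptyset \\[4pt]
\Delta_{\alpha p}(e_1 + e_2) = \Delta_{\alpha p}(e_1) \cup \Delta_{\alpha p}(e_2) \\[4pt]
\Delta_{\alpha p}(e_1 e_2) = \begin{cases} \Delta_{\alpha p}(e_1) \cdot e_2 & \text{if } E_\alpha(e_1) = 0 \\ \Delta_{\alpha p}(e_1) \cdot e_2 \cup \Delta_{\alpha p}(e_2) & \text{if } E_\alpha(e_1) = 1 \end{cases} \\[4pt]
\Delta_{\alpha p}(e^*) = \Delta_{\alpha p}(e) \cdot e^*,
\end{array}
\]

where for $\Gamma \subseteq \mathrm{Exp}$ and $e \in \mathrm{Exp}$, $\Gamma \cdot e = \{ e'e \mid e' \in \Gamma \}$ if $e \neq 0$ and $e \neq 1$, and $\Gamma \cdot 0 = \emptyset$ and $\Gamma \cdot 1 = \Gamma$,
otherwise. We note that $\Delta_{\alpha p}(e)$ corresponds to an equivalence class of $D_{\alpha p}(e)$ (the syntactic Brzozowski
derivative, defined in [17]) modulo axioms Q-Q, (8), and Q. Kozen calls such a structure a right
presemiring.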

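The following syntactic definition of $E_\alpha : At \to \mathrm{Exp} \to \{0, 1\}$ is from [17] and simply evaluates an
expression with respect to the truth assignment $\alpha$.

\[
\begin{array}{ll}
E_\alpha(p) = 0 & E_\alpha(e_1 + e_2) = E_\alpha(e_1) + E_\alpha(e_2) \\[4pt]
E_\alpha(b) = \begin{cases} 1 & \text{if } \alpha \leq b \\ 0 & \text{otherwise} \end{cases} & E_\alpha(e_1 e_2) = E_\alpha(e_1) E_\alpha(e_2) \\[4pt]
 & E_\alpha(e^*) = 1.
\end{array}
\]

One can show that,

\[
E_\alpha(e) = \begin{cases} 1 & \text{if } \alpha \leq e \\ 0 & \text{if } \alpha \not\leq e \end{cases}
\;=\; \begin{cases} 1 & \text{if } \alpha \in GS(e) \\ 0 & \text{if } \alpha \notin GS(e). \end{cases}
\]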
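
The two inductive definitions above translate almost line by line into code. The following Python sketch is an illustration only, under assumptions of our own: expressions are nested tuples such as `("dot", e1, e2)`, an atom is the frozenset of tests it makes true, and the names `E`, `cat`, and `delta` are hypothetical (the paper's OCaml code is not shown and may be organised differently).

```python
ZERO, ONE = ("zero",), ("one",)

def E(alpha, e):
    """E_alpha(e): evaluate e under the truth assignment alpha (0 or 1)."""
    tag = e[0]
    if tag in ("act", "zero"):
        return 0
    if tag in ("one", "star"):
        return 1
    if tag == "test":
        return int(e[1] in alpha)
    if tag == "not":
        return 1 - E(alpha, e[1])
    if tag == "plus":                      # Boolean sum of the two values
        return max(E(alpha, e[1]), E(alpha, e[2]))
    if tag == "dot":
        return E(alpha, e[1]) * E(alpha, e[2])
    raise ValueError(f"unknown expression tag: {tag}")

def cat(gamma, e):
    """Gamma . e, with the conventions Gamma . 0 = {} and Gamma . 1 = Gamma."""
    if e == ZERO:
        return set()
    if e == ONE:
        return set(gamma)
    return {("dot", d, e) for d in gamma}

def delta(alpha, p, e):
    """Set of partial derivatives of e with respect to alpha p."""
    tag = e[0]
    if tag == "act":
        return {ONE} if e[1] == p else set()
    if tag == "plus":
        return delta(alpha, p, e[1]) | delta(alpha, p, e[2])
    if tag == "dot":
        result = cat(delta(alpha, p, e[1]), e[2])
        if E(alpha, e[1]) == 1:            # alpha passes through e1
            result |= delta(alpha, p, e[2])
        return result
    if tag == "star":
        return cat(delta(alpha, p, e[1]), e)
    return set()                           # Boolean expressions
```

For instance, with `e = ("dot", ("test", "t"), ("act", "p"))`, the derivative with respect to an atom satisfying `t` is `{ONE}`, while for an atom falsifying `t` it is empty.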



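The next proposition shows that, for all KAT expressions $e$, the set of guarded strings corresponding
to the set of partial derivatives of $e$ w.r.t. $\alpha p \in At \cdot \Sigma$ is the derivative of $GS(e)$ by $\alpha p$.

Proposition 1. For all KAT expressions $e$, all atoms $\alpha$ and all symbols $p$,

\[ D_{\alpha p}(GS(e)) = GS(\Delta_{\alpha p}(e)). \]

Proof. The proof is obtained by induction on the structure of $e$. We exemplify with the case $e = e_1 e_2$,
where

\[
\begin{aligned}
D_{\alpha p}(GS(e)) &= D_{\alpha p}(GS(e_1) \diamond GS(e_2)) \\
&= \begin{cases} D_{\alpha p}(GS(e_1)) \diamond GS(e_2) & \text{if } \alpha \notin GS(e_1) \\ D_{\alpha p}(GS(e_1)) \diamond GS(e_2) \cup D_{\alpha p}(GS(e_2)) & \text{if } \alpha \in GS(e_1) \end{cases}
\end{aligned}
\]

applying the induction hypothesis

\[
\begin{aligned}
&= \begin{cases} \bigl(\bigcup_{e' \in \Delta_{\alpha p}(e_1)} GS(e')\bigr) \diamond GS(e_2) & \text{if } E_\alpha(e_1) = 0 \\ \bigl(\bigcup_{e' \in \Delta_{\alpha p}(e_1)} GS(e')\bigr) \diamond GS(e_2) \cup GS(\Delta_{\alpha p}(e_2)) & \text{if } E_\alpha(e_1) = 1 \end{cases} \\
&= \begin{cases} \bigcup_{e' \in \Delta_{\alpha p}(e_1)} GS(e' e_2) & \text{if } E_\alpha(e_1) = 0 \\ \bigl(\bigcup_{e' \in \Delta_{\alpha p}(e_1)} GS(e' e_2)\bigr) \cup GS(\Delta_{\alpha p}(e_2)) & \text{if } E_\alpha(e_1) = 1 \end{cases} \\
&= \begin{cases} GS(\Delta_{\alpha p}(e_1) \cdot e_2) & \text{if } E_\alpha(e_1) = 0 \\ GS(\Delta_{\alpha p}(e_1) \cdot e_2) \cup GS(\Delta_{\alpha p}(e_2)) & \text{if } E_\alpha(e_1) = 1 \end{cases} \\
&= GS(\Delta_{\alpha p}(e_1 e_2)) = GS(\Delta_{\alpha p}(e)).
\end{aligned}
\]

□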

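The notion of partial derivative of an expression w.r.t. $\alpha p \in At \cdot \Sigma$ can be extended to words $x \in
(At \cdot \Sigma)^*$, as follows,

\[
\begin{array}{l}
\Delta : (At \cdot \Sigma)^* \to \mathrm{Exp} \to \mathcal{P}(\mathrm{Exp}) \\
\Delta_1(e) = \{e\} \\
\Delta_{w \alpha p}(e) = \Delta_{\alpha p}(\Delta_w(e)).
\end{array}
\]

Here, the notion of (partial) derivatives has been extended to sets of KAT expressions $E \subseteq \mathrm{Exp}$, by
defining, as expected, $\Delta_{\alpha p}(E) = \bigcup_{e \in E} \Delta_{\alpha p}(e)$, for $\alpha p \in At \cdot \Sigma$. Analogously, we also consider $\Delta_x(E)$
and $\Delta_R(E)$, for $x \in (At \cdot \Sigma)^*$ and $R \subseteq (At \cdot \Sigma)^*$.

The fact that, for any $e \in \mathrm{Exp}$, the set $\Delta_{(At \cdot \Sigma)^*}(e)$ is finite ensures the termination of the decision
procedure presented in the next section.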



3.3 A Decision Procedure for KAT Expressions Equivalence 

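In this section we describe an algorithm for testing the equivalence of a pair of KAT expressions using
partial derivatives. Following Antimirov [4], and for the sake of efficiency, we define the function f that,
given an expression $e$, computes the set of pairs $(\alpha p, e')$ such that, for each $\alpha p \in At \cdot \Sigma$, the corresponding
$e'$ is a partial derivative of $e$ with respect to $\alpha p$.

\[
\begin{array}{l}
\mathsf{f} : \mathrm{Exp} \to \mathcal{P}(At \cdot \Sigma \times \mathrm{Exp}) \\
\mathsf{f}(p) = \{ (\alpha p, 1) \mid \alpha \in At \} \\
\mathsf{f}(b) = \emptyset \\
\mathsf{f}(e_1 + e_2) = \mathsf{f}(e_1) \cup \mathsf{f}(e_2) \\
\mathsf{f}(e_1 e_2) = \mathsf{f}(e_1) \cdot e_2 \cup \{ (\alpha p, e) \in \mathsf{f}(e_2) \mid E_\alpha(e_1) = 1 \} \\
\mathsf{f}(e^*) = \mathsf{f}(e) \cdot e^*
\end{array}
\]

where, as before, $\Gamma \cdot e = \{ (\alpha p, e'e) \mid (\alpha p, e') \in \Gamma \}$ if $e \neq 0$ and $e \neq 1$, and $\Gamma \cdot 0 = \emptyset$ and $\Gamma \cdot 1 = \Gamma$,
otherwise. The clause for $e^*$ mirrors that of $\Delta_{\alpha p}(e^*)$. Also, we denote by $\mathsf{hd}(\mathsf{f}(e)) = \{ \alpha p \mid (\alpha p, e') \in \mathsf{f}(e) \}$ the set of heads (i.e. first components
of each element) of $\mathsf{f}(e)$. The function $\mathsf{der}_{\alpha p}$, defined in (14), collects all the partial derivatives of an
expression $e$ w.r.t. $\alpha p$ that were computed by function f.

\[ \mathsf{der}_{\alpha p}(e) = \{ e' \mid (\alpha p, e') \in \mathsf{f}(e) \} \qquad (14) \]

The proof of the following proposition is almost trivial and follows from the symmetry of the definitions
of $\mathsf{der}_{\alpha p}$, f, and $\Delta_{\alpha p}$.

Proposition 2. For all $e \in \mathrm{Exp}$, $\alpha \in At$ and $p \in \Sigma$ one has $\mathsf{der}_{\alpha p}(e) = \Delta_{\alpha p}(e)$.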

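To define the decision procedure we need to consider the above functions, and the ones defined in
Section 3.2, applied to sets of KAT expressions. Then, we define the function derivatives that, given two
sets of KAT expressions $E_1$ and $E_2$, computes all pairs of sets of partial derivatives of $E_1$ and $E_2$ w.r.t.
$\alpha p \in At \cdot \Sigma$, respectively.

\[
\begin{array}{l}
\mathsf{derivatives} : \mathcal{P}(\mathrm{Exp})^2 \to \mathcal{P}(\mathcal{P}(\mathrm{Exp})^2) \\
\mathsf{derivatives}(E_1, E_2) = \{ (\mathsf{der}_{\alpha p}(E_1), \mathsf{der}_{\alpha p}(E_2)) \mid \alpha p \in \mathsf{hd}(E_1 \cup E_2) \}
\end{array}
\]

Finally, we present the function equiv that tests whether two (sets of) KAT expressions are equivalent. For two
sets of KAT expressions $E_1$ and $E_2$ the function returns True if, for every atom $\alpha$, $E_\alpha(E_1) = E_\alpha(E_2)$ and
if, for every $\alpha p$, the partial derivative of $E_1$ w.r.t. $\alpha p$ is equivalent to the partial derivative of $E_2$ w.r.t.
$\alpha p$.

\[
\begin{array}{l}
\mathsf{equiv} : \mathcal{P}(\mathcal{P}(\mathrm{Exp})^2) \times \mathcal{P}(\mathcal{P}(\mathrm{Exp})^2) \to \{\mathrm{True}, \mathrm{False}\} \\[4pt]
\mathsf{equiv}(\emptyset, H) = \mathrm{True} \\[4pt]
\mathsf{equiv}(\{(E_1, E_2)\} \cup S, H) = \begin{cases} \mathrm{False} & \text{if } \exists \alpha \in At : E_\alpha(E_1) \neq E_\alpha(E_2) \\ \mathsf{equiv}(S \cup S', H') & \text{otherwise,} \end{cases}
\end{array}
\]

where

\[ S' = \{ d \mid d \in \mathsf{derivatives}(E_1, E_2) \text{ and } d \notin H' \} \quad \text{and} \quad H' = \{(E_1, E_2)\} \cup H. \]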

The function equiv accepts two sets $S$ and $H$ as arguments. At each step, $S$ contains the pairs of (sets
of) expressions that still need to be checked for equivalence, whereas $H$ contains the pairs of (sets of)
expressions that have already been tested. The use of the set $H$ is important to ensure that the derivatives
of the same pair of (sets of) expressions are not computed more than once, and thus to prevent a possible
infinite loop.

To compare two expressions $e_1$ and $e_2$, the initial call must be $\mathsf{equiv}(\{(\{e_1\}, \{e_2\})\}, \emptyset)$. At each step
the function takes a pair $(E_1, E_2)$ and verifies whether there exists an atom $\alpha$ such that $E_\alpha(E_1) \neq E_\alpha(E_2)$. If
such an atom exists, then $e_1 \neq e_2$ and the function halts, returning False. If no such atom exists, then
the function adds $(E_1, E_2)$ to $H$ and then replaces in $S$ the pair $(E_1, E_2)$ by the pairs of its corresponding
derivatives, provided that these are not in $H$ already. The return value of equiv will be the result of
recursively calling equiv with the new sets as arguments. If the function ever receives $\emptyset$ as $S$, then the
initial call ensures that $e_1 = e_2$, since all derivatives have been successfully tested, and the function
returns True.
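
The procedure just described can be sketched in Python as follows. This is a minimal illustration and not the authors' OCaml code: the tuple encoding of expressions, the frozenset encoding of atoms, and all function names are assumptions of this sketch; the star clause of `f` is taken to be f(e)·e*, by analogy with the definition of the partial derivative of e*; and the recursion on (S, H) is written as a loop.

```python
from itertools import product

ZERO, ONE = ("zero",), ("one",)

def atoms(tests):
    """All atoms over `tests`; an atom is the frozenset of tests assigned true."""
    names = sorted(tests)
    return [frozenset(t for t, v in zip(names, bits) if v)
            for bits in product([False, True], repeat=len(names))]

def E(alpha, e):
    """E_alpha(e): 1 if the atom alpha belongs to GS(e), else 0."""
    tag = e[0]
    if tag in ("act", "zero"):
        return 0
    if tag in ("one", "star"):
        return 1
    if tag == "test":
        return int(e[1] in alpha)
    if tag == "not":
        return 1 - E(alpha, e[1])
    if tag == "plus":
        return max(E(alpha, e[1]), E(alpha, e[2]))
    return E(alpha, e[1]) * E(alpha, e[2])             # tag == "dot"

def cat(pairs, e):
    """Gamma . e on sets of (head, derivative) pairs."""
    if e == ZERO:
        return set()
    if e == ONE:
        return set(pairs)
    return {(h, ("dot", d, e)) for (h, d) in pairs}

def f(e, at):
    """Pairs ((alpha, p), e') with e' a partial derivative of e w.r.t. alpha p."""
    tag = e[0]
    if tag == "act":
        return {((a, e[1]), ONE) for a in at}
    if tag == "plus":
        return f(e[1], at) | f(e[2], at)
    if tag == "dot":
        kept = {(h, d) for (h, d) in f(e[2], at) if E(h[0], e[1]) == 1}
        return cat(f(e[1], at), e[2]) | kept
    if tag == "star":                                  # assumed clause: f(e) . e*
        return cat(f(e[1], at), e)
    return set()                                       # Boolean expressions

def derivatives(E1, E2, at):
    """Pairs (der_h(E1), der_h(E2)) for every head h occurring in f(E1) or f(E2)."""
    f1 = {x for e in E1 for x in f(e, at)}
    f2 = {x for e in E2 for x in f(e, at)}
    heads = {h for (h, _) in f1 | f2}
    return {(frozenset(d for (g, d) in f1 if g == h),
             frozenset(d for (g, d) in f2 if g == h)) for h in heads}

def E_set(alpha, exprs):
    """E_alpha lifted to a set of expressions (the value of their sum)."""
    return max((E(alpha, e) for e in exprs), default=0)

def equiv(e1, e2, tests):
    """Decide e1 = e2; S holds pending pairs, H the pairs already tested."""
    at = atoms(tests)
    S = {(frozenset([e1]), frozenset([e2]))}
    H = set()
    while S:
        pair = S.pop()
        E1, E2 = pair
        if any(E_set(a, E1) != E_set(a, E2) for a in at):
            return False
        H.add(pair)
        S |= {d for d in derivatives(E1, E2, at) if d not in H}
    return True
```

As a usage example, `equiv` confirms the classical identity (p + q)* = (p* q*)* and the KAT identity t p + ¬t p = p, and it rejects t p = p. Restricting attention to the heads that actually occur is sound because any other head yields the pair of empty sets, which is trivially equivalent.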



3.4 Termination and Correctness 

First, we show that the function equiv terminates. For every KAT expression $e$, we define the set $PD(e)$
and show that the set of partial derivatives of $e$ is a subset of $PD(e)$, which
in turn is clearly finite. The set $PD(e)$ coincides with the closure of a KAT expression $e$,
defined by Kozen, and is also similar to Mirkin's prebases [19].

\[
\begin{array}{ll}
PD(b) = \{b\} & PD(e_1 + e_2) = \{e_1 + e_2\} \cup PD(e_1) \cup PD(e_2) \\
PD(p) = \{p, 1\} & PD(e_1 e_2) = \{e_1 e_2\} \cup PD(e_1) \cdot e_2 \cup PD(e_2) \\
 & PD(e^*) = \{e^*\} \cup PD(e) \cdot e^*.
\end{array}
\]

Lemma 1. Consider $e, e' \in \mathrm{Exp}$, $\alpha \in At$ and $p \in \Sigma$. If $e' \in PD(e)$, then $\Delta_{\alpha p}(e') \subseteq PD(e)$.

Proof. The proof is obtained by induction on the structure of $e$. We exemplify with the case $e = e_1 e_2$.
Let $e' \in PD(e_1 e_2) = \{e_1 e_2\} \cup PD(e_1) \cdot e_2 \cup PD(e_2)$.

• If $e' \in \{e_1 e_2\}$, then $\Delta_{\alpha p}(e') \subseteq \Delta_{\alpha p}(e_1) \cdot e_2 \cup \Delta_{\alpha p}(e_2)$. But $e_1 \in PD(e_1)$ and $e_2 \in PD(e_2)$, so
applying the induction hypothesis twice, we obtain $\Delta_{\alpha p}(e') \subseteq PD(e_1) \cdot e_2 \cup PD(e_2) \subseteq PD(e)$.

• If $e' \in PD(e_1) \cdot e_2$, then $e' = e'_1 e_2$ such that $e'_1 \in PD(e_1)$. So $\Delta_{\alpha p}(e') \subseteq \Delta_{\alpha p}(e'_1) \cdot e_2 \cup \Delta_{\alpha p}(e_2)
\subseteq PD(e_1) \cdot e_2 \cup PD(e_2) \subseteq PD(e)$.

• Finally, if $e' \in PD(e_2)$, again by the induction hypothesis we have $\Delta_{\alpha p}(e') \subseteq PD(e_2) \subseteq PD(e)$.

□

Proposition 3. For all $x \in (At \cdot \Sigma)^*$, one has $\Delta_x(e) \subseteq PD(e)$.

Proof. We prove this proposition by induction on the length of $x$. If $|x| = 0$, i.e. $x = 1$, then $\Delta_1(e) = \{e\} \subseteq
PD(e)$. If $x = w \alpha p$, then $\Delta_{w \alpha p}(e) = \bigcup_{e' \in \Delta_w(e)} \Delta_{\alpha p}(e')$. By the induction hypothesis, we know that $\Delta_w(e) \subseteq
PD(e)$. By Lemma 1, if $e' \in PD(e)$, then $\Delta_{\alpha p}(e') \subseteq PD(e)$. Consequently, $\bigcup_{e' \in \Delta_w(e)} \Delta_{\alpha p}(e') \subseteq PD(e)$. □

Corollary 1. For all expressions $e$, the set $\Delta_{(At \cdot \Sigma)^*}(e)$ is finite.
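
The finiteness argument rests on the fact that PD(e) is built by a plain structural recursion. A small Python sketch of that recursion is given below. It is illustrative only: the tuple encoding and the names `PD` and `cat` are assumptions of the sketch, the base case for an action is taken to be {p, 1} (which Lemma 1 needs, since the only partial derivative of p is 1), and Boolean sums and products are traversed like any other expression, which can only enlarge the still finite result.

```python
ZERO, ONE = ("zero",), ("one",)

def cat(gamma, e):
    """Gamma . e, with the conventions Gamma . 0 = {} and Gamma . 1 = Gamma."""
    if e == ZERO:
        return set()
    if e == ONE:
        return set(gamma)
    return {("dot", d, e) for d in gamma}

def PD(e):
    """Finite superset of all partial derivatives of e by any word."""
    tag = e[0]
    if tag == "act":
        return {e, ONE}
    if tag == "plus":
        return {e} | PD(e[1]) | PD(e[2])
    if tag == "dot":
        return {e} | cat(PD(e[1]), e[2]) | PD(e[2])
    if tag == "star":
        return {e} | cat(PD(e[1]), e)
    return {e}                      # primitive tests, negations, 0 and 1
```

For example, for actions p and q, `PD(("dot", p, q))` contains exactly p q, 1 q, q, and 1, and `PD(("star", p))` contains p*, p p*, and 1 p*.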

It is obvious that the previous results also apply to sets of KAT expressions.

Proposition 4. The function equiv is terminating.

Proof. When the set $S$ is empty it follows directly from the definition of the function that it terminates.
We argue that when $S$ is not empty the function also terminates, based on these two aspects:

• In order to ensure that the set of partial derivatives of a pair of (sets of) expressions is not
computed more than once, the set $H$ is used to store the pairs which have already been calculated.

• Each function call removes one pair $(E_1, E_2)$ from the set $S$ and adds to $S$ the set of partial
derivatives of $(E_1, E_2)$ which have not been calculated yet. By Corollary 1 the set of partial
derivatives of an expression by any word is finite, and so eventually $S$ becomes $\emptyset$.

Thus, since at each call the function analyzes one pair from $S$, after a finite number of calls the function
terminates. □

The next proposition states the correctness of our algorithm. Coalgebraically it states that two KAT
expressions are equivalent if and only if there exists a bisimulation between them [17, Thm. 5.3].

Proposition 5. For all KAT expressions $e_1$ and $e_2$,

\[
GS(e_1) = GS(e_2) \;\Leftrightarrow\; \begin{cases} E_\alpha(e_1) = E_\alpha(e_2) \text{ and} \\ GS(\Delta_{\alpha p}(e_1)) = GS(\Delta_{\alpha p}(e_2)), \end{cases} \quad \forall \alpha \in At, \forall p \in \Sigma.
\]

Proof. Let us first prove the $\Leftarrow$ implication. If $GS(e_1) \neq GS(e_2)$, then there is $x \in GS$ such that $x \in
GS(e_1)$ and $x \notin GS(e_2)$ (or vice versa). If $x = \alpha$, then we have $E_\alpha(e_1) = 1 \neq 0 = E_\alpha(e_2)$ and the test
fails. If $x = \alpha p w$, such that $w \in (At \cdot \Sigma)^* \cdot At$, then since $\alpha p w \in GS(e_1)$ and $\alpha p w \notin GS(e_2)$, we have that
$w \in GS(\Delta_{\alpha p}(e_1))$ and $w \notin GS(\Delta_{\alpha p}(e_2))$. Thus, $GS(\Delta_{\alpha p}(e_1)) \neq GS(\Delta_{\alpha p}(e_2))$.

Let us now prove the $\Rightarrow$ implication. For $\alpha \in At$, either $\alpha \in GS(e_1)$ and $\alpha \in GS(e_2)$, thus
$E_\alpha(e_1) = E_\alpha(e_2) = 1$; or $\alpha \notin GS(e_1)$ and $\alpha \notin GS(e_2)$, thus $E_\alpha(e_1) = E_\alpha(e_2) = 0$. For $\alpha p \in At \cdot \Sigma$, by
Proposition 1 one has $GS(\Delta_{\alpha p}(e_1)) = GS(\Delta_{\alpha p}(e_2))$ if and only if $D_{\alpha p}(GS(e_1)) = D_{\alpha p}(GS(e_2))$. This
follows trivially from $GS(e_1) = GS(e_2)$. □



4 Implementation 

The algorithm presented in the previous section was implemented in OCaml [21]. Alternations,
conjunctions, and disjunctions are represented by sets, and thus commutativity and idempotence are
naturally enforced. Concatenations are represented by lists of expressions. Primitive tests occurring in a
KAT expression are represented by integers, and atoms by lists of boolean values (where primitive tests
correspond to indexes). For each KAT expression $e$, we consider $At$ as the set of atoms that correspond
to the primitive tests that occur in $e$. The implementations of the functions defined in Section 3.2 and
Section 3.3 do not differ much from their formal definitions. A common choice was the use of
comprehension lists to define the inclusion criteria of elements in a set. Because of our basic representation
of KAT expressions, we treat both expressions and sets of expressions in a uniform way. The function
$E_\alpha$, used in equiv, is implemented using a function called eAII, which takes as arguments two (sets of)
expressions $E_1$ and $E_2$ and verifies whether, for every atom, the truth assignments for $E_1$ and $E_2$ coincide.



4.1 Experimental Results 

In order to test the performance of our decision procedure we ran some experiments. We used the FAdo
system [9] to generate uniformly random samples of KAT expressions. Each sample has 10000 KAT
expressions of a given length $|e|$ (number of symbols in the syntactic tree of $e \in \mathrm{Exp}$). The size of each
sample is more than enough to ensure statistically significant results with a 95% confidence level within a
5% error margin. The tests were executed on the same computer, an Intel® Xeon® 5140 at 2.33 GHz
with 4 GB of RAM, running a minimal 64 bit Linux system. For each sample we performed two
experiments: (1) we tested the equivalence of each KAT expression against itself; (2) we tested the equivalence
of two consecutive KAT expressions. For each pair of KAT expressions we measured the size of the set
$H$ produced by equiv (which measures the number of iterations) and the number of primitive tests in each
expression ($|e|_T$). Table 1 summarizes some of the results obtained. Each row corresponds to a sample,
where the first three columns characterize the sample: respectively, the number of primitive actions ($k$),
the number of primitive tests ($l$), and the length of each KAT expression generated. Column four has the
number of primitive tests in each expression ($|e|_T$). Columns five and six give the average size of $H$ in
experiments (1) and (2), respectively. Column seven is the ratio of equivalent pairs in experiment (2).
Finally, columns eight and nine contain the average times, in seconds, of each comparison in
experiments (1) and (2). Rather than comparing with existing systems, which is difficult for the reasons pointed
out in the introduction, these experiments aimed to test the feasibility of the procedure. As expected, the
main bottleneck is the number of different primitive tests in the KAT expressions.



 1    2    3     4       5       6      7      8         9
 k    l    |e|   |e|_T   H(1)    H(2)   =(2)   Time(1)   Time(2)
 5    5    50    9.98    7.35    0.53   0.42   0.0097    0.00087
 5    5    100   19.71   15.74   0.76   0.48   0.0875    0.00223
 10   10   50    11.12   8.30    0.50   0.07   0.5050    0.30963
 10   10   100   21.93   16.78   0.67   0.18   20.45     1.31263
 15   15   50    11.57   8.47    0.47   0.10   6.4578    55.22

Table 1: Experimental results for uniformly random generated KAT expressions.



5 Hoare Logic and KAT 

Hoare logic was first introduced in 1969, cf. [11], and is a formal system widely used for the specification
and verification of programs. Hoare logic uses partial correctness assertions (PCAs) to reason about
program correctness. A PCA is a triple, $\{b\} P \{c\}$, with $P$ being a program, and $b$ and $c$ logic formulas.
We read such an assertion as: if $b$ holds before the execution of $P$, then $c$ will necessarily hold at the
end of the execution, provided that $P$ halts. A deductive system of Hoare logic provides inference rules
for deriving valid PCAs, where rules depend on the program constructs. We consider a simple while
language, where a program $P$ can be defined, as usual, by an assignment $x := v$; a skip command; a
sequence $P; Q$; a conditional if $b$ then $P$ else $Q$; and a loop while $b$ do $P$.

There are several variations of Hoare logic and here we choose an inference system, considered
in [10], that enjoys the sub-formula property, where the premises of a rule can be obtained from the
assertions that occur in the rule's conclusion. With this property, given a PCA $\{b\} P \{c\}$, where $P$ has
also some annotated assertions, it is possible to automatically generate verification conditions that will
ensure its validity. The inference rules for this system are the following:

\[
\frac{b \to c}{\{b\}\ \mathsf{skip}\ \{c\}}
\qquad
\frac{b \to c[x/e]}{\{b\}\ x := e\ \{c\}}
\qquad
\frac{\{b\}\, P\, \{c\} \quad \{c\}\, Q\, \{d\}}{\{b\}\ P; \{c\}\, Q\ \{d\}}
\]

\[
\frac{\{b \wedge c\}\, P\, \{d\} \quad \{\neg b \wedge c\}\, Q\, \{d\}}{\{c\}\ \mathsf{if}\ b\ \mathsf{then}\ P\ \mathsf{else}\ Q\ \{d\}}
\qquad
\frac{\{b \wedge i\}\, P\, \{i\} \quad c \to i \quad (i \wedge \neg b) \to d}{\{c\}\ \mathsf{while}\ b\ \mathsf{do}\ \{i\}\, P\ \{d\}}
\]
5.1 Encoding Propositional Hoare Logic in KAT

The propositional fragment of Hoare logic (PHL), i.e., the fragment without the rule for assignment,
can be encoded in KAT [15]. The encoding of an annotated while program $P$ and of our inference
system follow the same lines. In PHL, all assignment instructions are represented by primitive symbols
$p$. The skip command is encoded by a distinguished primitive symbol $p_{skip}$. If $e_1$, $e_2$ are respectively the
encodings of programs $P_1$ and $P_2$, then the encoding of more complex constructs of an annotated while
program involving $P_1$ and $P_2$ is as follows.

\[
\begin{array}{lcl}
P_1; \{c\}\, P_2 & \Rightarrow & e_1 c e_2 \\
\mathsf{if}\ b\ \mathsf{then}\ P_1\ \mathsf{else}\ P_2 & \Rightarrow & b e_1 + \bar{b} e_2 \\
\mathsf{while}\ b\ \mathsf{do}\ \{i\}\, P_1 & \Rightarrow & (b i e_1)^* \bar{b}
\end{array}
\]

A PCA of the form $\{b\} P \{c\}$ is encoded in KAT as an equational identity of the form

\[ b e = b e c \quad \text{or equivalently by} \quad b e \bar{c} = 0, \]

where $e$ is the encoding of the program $P$.
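
The encoding is a direct structural translation, which the following Python sketch makes explicit. It is an illustration under assumptions of our own: programs are nested tuples, tests are plain strings, the output is a textual KAT expression in which `~` stands for negation, and the names `encode`, `encode_pca`, and `neg` are hypothetical.

```python
def neg(b):
    """Textual negation of a test; compound tests are parenthesised."""
    return "~" + b if b.isidentifier() else "~(" + b + ")"

def encode(prog):
    """Encode an annotated while program as a textual KAT expression.

    Program forms (an assumption of this sketch):
      ("prim", name)        an assignment, abstracted to a primitive action
      ("skip",)             the skip command
      ("seq", P1, c, P2)    P1 ; {c} P2
      ("if", b, P1, P2)     if b then P1 else P2
      ("while", b, i, P1)   while b do {i} P1
    """
    kind = prog[0]
    if kind == "prim":
        return prog[1]
    if kind == "skip":
        return "p_skip"
    if kind == "seq":
        _, p1, c, p2 = prog
        return f"{encode(p1)} {c} {encode(p2)}"
    if kind == "if":
        _, b, p1, p2 = prog
        return f"({b} {encode(p1)} + {neg(b)} {encode(p2)})"
    if kind == "while":
        _, b, i, p1 = prog
        return f"({b} {i} {encode(p1)})* {neg(b)}"
    raise ValueError(f"unknown program form: {kind}")

def encode_pca(b, prog, c):
    """Encode the PCA {b} prog {c} as the equation  b e ~c = 0."""
    return f"{b} {encode(prog)} {neg(c)} = 0"
```

For instance, a loop `("while", "b", "i", ("prim", "p"))` is encoded as `(b i p)* ~b`, and wrapping it in a PCA with precondition `c` and postcondition `d` gives `c (b i p)* ~b ~d = 0`.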

Now, suppose we want to prove the PCA $\{b\} P \{c\}$. Since the inference system for Hoare logic that
we are considering in this paper enjoys the sub-formula property, one can generate mechanically, in a
backward fashion, the verification conditions that ensure the PCA's validity.

Since in the KAT encoding, $b e \bar{c} = 0$, we do not have the rule for assignment, besides verification
conditions (proof obligations) of the form $b' \to c'$ we will also have assumptions of the form $b' p \bar{c'} = 0$.

One can generate a set of assumptions, $\Gamma = \mathsf{Gen}(b e \bar{c})$, backwards from $b e \bar{c} = 0$, where Gen is
inductively defined by:

\[
\begin{array}{lcll}
\mathsf{Gen}(b\, p_{skip}\, \bar{c}) & = & \{ b \leq c \} & \\
\mathsf{Gen}(b\, p\, \bar{c}) & = & \{ b p \bar{c} = 0 \} & p \neq p_{skip},\ p \in \Sigma \\
\mathsf{Gen}(b\, e_1 c\, e_2\, \bar{d}) & = & \mathsf{Gen}(b\, e_1\, \bar{c}) \cup \mathsf{Gen}(c\, e_2\, \bar{d}) & \\
\mathsf{Gen}(b\, (c e_1 + \bar{c} e_2)\, \bar{d}) & = & \mathsf{Gen}(b c\, e_1\, \bar{d}) \cup \mathsf{Gen}(b \bar{c}\, e_2\, \bar{d}) & \\
\mathsf{Gen}(b\, ((c i e)^* \bar{c})\, \bar{d}) & = & \mathsf{Gen}(i c\, e\, \bar{i}) \cup \{ b \leq i,\ i \bar{c} \leq d \} &
\end{array}
\]

Note that $\Gamma$ is necessarily of the form

\[ \Gamma = \{ b_1 p_1 \overline{b'_1} = 0, \ldots, b_m p_m \overline{b'_m} = 0 \} \cup \{ c_1 \leq c'_1, \ldots, c_n \leq c'_n \}, \]

where $p_1, \ldots, p_m \in \Sigma$ and such that all $b$'s and $c$'s are Bexp expressions. In Section 6 we show how one
can prove the validity of $b e \bar{c} = 0$ in the presence of such a set of assumptions $\Gamma$, but first we illustrate
the encoding and generation of the assumption set with an example.
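
The clauses of Gen can be read as a recursive traversal of the annotated program. The sketch below illustrates this in Python. It works on the program structure rather than on the encoded KAT expression, which is a simplification of ours; tests are plain strings with `~` for negation and juxtaposition for conjunction, and the tuple formats and function names are assumptions of the sketch.

```python
def neg(b):
    """Textual negation of a test; compound tests are parenthesised."""
    return "~" + b if b.isidentifier() else "~(" + b + ")"

def gen(b, prog, d):
    """Assumptions generated backwards from the PCA {b} prog {d}.

    Returns a set containing
      ("zero", b, p, c)   standing for the equation  b p ~c = 0
      ("leq", c, c2)      standing for the inequality  c <= c2
    Program forms: ("prim", p), ("skip",), ("seq", P1, c, P2),
    ("if", c, P1, P2) and ("while", c, i, P1).
    """
    kind = prog[0]
    if kind == "skip":
        return {("leq", b, d)}
    if kind == "prim":
        return {("zero", b, prog[1], d)}
    if kind == "seq":
        _, p1, c, p2 = prog
        return gen(b, p1, c) | gen(c, p2, d)
    if kind == "if":
        _, c, p1, p2 = prog
        return gen(f"{b} {c}", p1, d) | gen(f"{b} {neg(c)}", p2, d)
    if kind == "while":
        _, c, i, p1 = prog
        return gen(f"{i} {c}", p1, i) | {("leq", b, i), ("leq", f"{i} {neg(c)}", d)}
    raise ValueError(f"unknown program form: {kind}")
```

For a loop `while c do {i} p` with precondition `b` and postcondition `d`, the sketch returns the three assumptions `i c p ~i = 0`, `b <= i`, and `i ~c <= d`, matching the last clause of Gen.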



5.2 A Small Example 

Consider the program $P$ in Table 2, which calculates the factorial of a non-negative integer. We wish to
prove that, at the end of the execution, the variable $y$ contains the factorial of $x$, i.e. to verify the assertion
$\{True\}\, P\, \{y = x!\}$.

In order to apply the inference rules we need to annotate program $P$, obtaining program $P'$. Applying
the inference rules for deriving PCAs in a backward fashion to $\{True\}\, P'\, \{y = x!\}$, one easily generates
Program P            Annotated Program P'     Symbols used in the encoding

                     y := 1;                  p_1
                     {y = 0!}                 t_1
y := 1;              z := 0;                  p_2
z := 0;              {y = z!}                 t_2
while ¬ z = x do     while ¬ z = x do         t_3
{                    {
  z := z+1;            {y = z!}               t_2
  y := y×z;            z := z+1;              p_3
}                      {y×z = z!}             t_4
                       y := y×z;              p_4
                     }

Table 2: A program for the factorial



the corresponding set of assumptions provided by the annotated version of the program. However,
because we do not have the assignment rule in the KAT encoding, here we simulate it by considering not
only verification conditions but also atomic PCAs $\{b'\}\, x := e\, \{c'\}$. Thus the assumption set is

\[
\begin{array}{l}
\{True\}\, y := 1\, \{y = 0!\}, \quad \{y = 0!\}\, z := 0\, \{y = z!\}, \\
\{y = z! \wedge \neg z = x\}\, z := z + 1\, \{y \times z = z!\}, \quad \{y \times z = z!\}\, y := y \times z\, \{y = z!\}, \\
y = z! \to y = z!, \quad (y = z! \wedge \neg\neg z = x) \to y = x!
\end{array}
\]

On the other hand, using the correspondence of KAT primitive symbols and atomic parts of the
annotated program $P'$, as in Table 2, and additionally encoding $True$ as $t_0$ and $y = x!$ as $t_5$, respectively,
the encoding of $\{True\}\, P'\, \{y = x!\}$ in KAT is

\[ t_0 p_1 t_1 p_2 t_2 (t_3 t_2 p_3 t_4 p_4)^* \bar{t_3}\, \bar{t_5} = 0. \qquad (15) \]

The corresponding set of assumptions $\Gamma$ in KAT is

\[ \Gamma = \{ t_0 p_1 \bar{t_1} = 0,\ t_1 p_2 \bar{t_2} = 0,\ t_2 t_3 p_3 \bar{t_4} = 0,\ t_4 p_4 \bar{t_2} = 0,\ t_2 \leq t_2,\ t_2 \bar{t_3} \leq t_5 \}. \qquad (16) \]

In the next section we will see how to prove in KAT an equation such as (15) in the presence of a set of
assumptions such as (16).



6 Deciding Hoare Logic 

Rephrasing the observation at the end of the last section, we are interested in proving in KAT the validity of
implications of the form

\[ b_1 p_1 \overline{b'_1} = 0 \wedge \cdots \wedge b_m p_m \overline{b'_m} = 0 \wedge c_1 \leq c'_1 \wedge \cdots \wedge c_n \leq c'_n \to b p \overline{b'} = 0. \qquad (17) \]

This can be reduced to proving the equivalence of KAT expressions, since it has been shown, cf. [15],
that for all KAT expressions $r_1, \ldots, r_n, e_1, e_2$ over $\Sigma = \{p_1, \ldots, p_k\}$ and $T = \{t_1, \ldots, t_l\}$, an implication of
the form

\[ r_1 = 0 \wedge \cdots \wedge r_n = 0 \to e_1 = e_2 \]

is a theorem of KAT if and only if

\[ e_1 + u r u = e_2 + u r u \qquad (18) \]

where $u = (p_1 + \cdots + p_k)^*$ and $r = r_1 + \ldots + r_n$. Testing this last equality can of course be done by
applying our algorithm to $e_1 + u r u$ and $e_2 + u r u$. However, in the next subsection, we present an
alternative method of proving the validity of implications of the form (17). This method has the advantage of
dispensing with the expressions $u$ and $r$ above.

6.1 Equivalence of KAT Expressions Modulo a Set of Assumptions 

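In the presence of a finite set of assumptions of the form

\[ \Gamma = \{ b_1 p_1 \overline{b'_1} = 0, \ldots, b_m p_m \overline{b'_m} = 0 \} \cup \{ c_1 \leq c'_1, \ldots, c_n \leq c'_n \} \qquad (19) \]

we have to restrict ourselves to atoms that satisfy the restrictions in $\Gamma$. Thus, let

\[ At^\Gamma = \{ \alpha \in At \mid \alpha \leq c \to \alpha \leq c', \text{ for all } c \leq c' \in \Gamma \}. \qquad (20) \]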
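
Definition (20) amounts to filtering the atoms by the inequalities of the assumption set. A possible Python sketch of this filter is shown below; the tuple encoding of Boolean expressions, the frozenset encoding of atoms, and the names `atoms`, `holds`, and `atoms_mod` are assumptions of this illustration.

```python
from itertools import product

def atoms(tests):
    """All atoms over `tests`; an atom is the frozenset of tests assigned true."""
    names = sorted(tests)
    return [frozenset(t for t, v in zip(names, bits) if v)
            for bits in product([False, True], repeat=len(names))]

def holds(alpha, b):
    """True when the atom alpha satisfies the Boolean expression b (alpha <= b)."""
    tag = b[0]
    if tag == "zero":
        return False
    if tag == "one":
        return True
    if tag == "test":
        return b[1] in alpha
    if tag == "not":
        return not holds(alpha, b[1])
    if tag == "plus":
        return holds(alpha, b[1]) or holds(alpha, b[2])
    if tag == "dot":
        return holds(alpha, b[1]) and holds(alpha, b[2])
    raise ValueError(f"not a Boolean expression: {tag}")

def atoms_mod(tests, leqs):
    """Atoms satisfying every assumption c <= c2, given as pairs (c, c2)."""
    return [a for a in atoms(tests)
            if all((not holds(a, c)) or holds(a, c2) for (c, c2) in leqs)]
```

With the two inequalities of the factorial example, $t_2 \leq t_2$ and $t_2 \bar{t_3} \leq t_5$, and the tests $t_2, t_3, t_5$, only the atom that makes $t_2$ true and both $t_3$ and $t_5$ false is discarded, leaving seven of the eight atoms.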

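Given a KAT expression $e$, the set of guarded strings modulo $\Gamma$, $GS^\Gamma(e)$, is inductively defined as
follows.

\[
\begin{array}{lcl}
GS^\Gamma(p) & = & \{ \alpha p \beta \mid \alpha, \beta \in At^\Gamma \wedge \forall_{b p \overline{b'} = 0 \in \Gamma}\, (\alpha \leq b \to \beta \leq b') \} \\
GS^\Gamma(b) & = & \{ \alpha \in At^\Gamma \mid \alpha \leq b \} \\
GS^\Gamma(e_1 + e_2) & = & GS^\Gamma(e_1) \cup GS^\Gamma(e_2) \\
GS^\Gamma(e_1 e_2) & = & GS^\Gamma(e_1) \diamond GS^\Gamma(e_2) \\
GS^\Gamma(e^*) & = & \bigcup_{n \geq 0} GS^\Gamma(e)^n.
\end{array}
\]

The following proposition characterizes the equivalence modulo a set of assumptions $\Gamma$, and ensures
the correctness of the new Hoare logic decision procedure.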



Proposition 6. Let $e_1$ and $e_2$ be KAT expressions and $\Gamma$ a set of assumptions as in (19). Then,

\[ \mathrm{KAT}, \Gamma \vdash e_1 = e_2 \quad \text{iff} \quad GS^\Gamma(e_1) = GS^\Gamma(e_2). \]

Proof. By (18) one has $\mathrm{KAT}, \Gamma \vdash e_1 = e_2$ if and only if $e_1 + u r u = e_2 + u r u$ is provable in KAT, where
$u = (p_1 + \cdots + p_k)^*$ and $r = b_1 p_1 \overline{b'_1} + \cdots + b_m p_m \overline{b'_m} + c_1 \overline{c'_1} + \cdots + c_n \overline{c'_n}$. The second equality is equivalent
to $GS(e_1 + u r u) = GS(e_2 + u r u)$, i.e. $GS(e_1) \cup GS(u r u) = GS(e_2) \cup GS(u r u)$. In order to show the
equivalence of this last equality and $GS^\Gamma(e_1) = GS^\Gamma(e_2)$, it is sufficient to show that for every KAT expression
$e$ one has $GS^\Gamma(e) = GS(e) \setminus GS(u r u)$ (note that $A \cup C = B \cup C \Leftrightarrow A \setminus C = B \setminus C$).

First we analyze under which conditions a guarded string $x$ is an element of $GS(u r u)$. Given the
values of $u$ and $r$, it is easy to see that $x \in GS(u r u)$ if and only if in $x$ occurs an atom $\alpha$ such that $\alpha \leq c$
and $\alpha \not\leq c'$ for some $c \leq c' \in \Gamma$, or $x$ has a substring $\alpha p \beta$ such that $\alpha \leq b$ and $\beta \not\leq b'$ for some $b p \overline{b'} = 0 \in \Gamma$.
This means that $x \notin GS(u r u)$ if and only if every atom in $x$ is an element of $At^\Gamma$ and every substring $\alpha p \beta$
of $x$ satisfies $(\alpha \leq b \to \beta \leq b')$, for all $b p \overline{b'} = 0 \in \Gamma$. From this remark and by the definitions of $At^\Gamma$ and
$GS^\Gamma$, we conclude that $GS^\Gamma(e) \cap GS(u r u) = \emptyset$. Note also that, since $GS^\Gamma(e)$ is a restriction of $GS(e)$, one
has $GS^\Gamma(e) \subseteq GS(e)$. Now it suffices to show that for every $x \in GS(e) \setminus GS(u r u)$, one has $x \in GS^\Gamma(e)$.
This can be easily proved by induction on the structure of $e$. □
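We now define the set of partial derivatives of a KAT expression modulo a set of assumptions $\Gamma$. Let
$e \in \mathrm{Exp}$. If $\alpha \notin At^\Gamma$, then $\Delta^\Gamma_{\alpha p}(e) = \emptyset$. For $\alpha \in At^\Gamma$, let

\[
\begin{array}{l}
\Delta^\Gamma_{\alpha p}(p') = \begin{cases} \bigl\{ \prod_{b p \overline{b'} = 0 \in \Gamma,\ \alpha \leq b} b' \bigr\} & \text{if } p = p' \\ \emptyset & \text{otherwise} \end{cases} \\[6pt]
\Delta^\Gamma_{\alpha p}(b) = \emptyset \\[4pt]
\Delta^\Gamma_{\alpha p}(e_1 + e_2) = \Delta^\Gamma_{\alpha p}(e_1) \cup \Delta^\Gamma_{\alpha p}(e_2) \\[4pt]
\Delta^\Gamma_{\alpha p}(e_1 e_2) = \begin{cases} \Delta^\Gamma_{\alpha p}(e_1) \cdot e_2 & \text{if } E_\alpha(e_1) = 0 \\ \Delta^\Gamma_{\alpha p}(e_1) \cdot e_2 \cup \Delta^\Gamma_{\alpha p}(e_2) & \text{if } E_\alpha(e_1) = 1 \end{cases} \\[6pt]
\Delta^\Gamma_{\alpha p}(e^*) = \Delta^\Gamma_{\alpha p}(e) \cdot e^*.
\end{array}
\]

Note that, by definition, $\prod b' = 1$ if there is no $b p \overline{b'} = 0 \in \Gamma$ such that $\alpha \leq b$ and $\alpha \in At^\Gamma$. The next
proposition states the correctness of the definition of $\Delta^\Gamma_{\alpha p}$.

Proposition 7. Let $\Gamma$ be a set of assumptions as above, $e \in \mathrm{Exp}$, $\alpha \in At$, and $p \in \Sigma$. Then,

\[ D_{\alpha p}(GS^\Gamma(e)) = GS^\Gamma(\Delta^\Gamma_{\alpha p}(e)). \]

Proof. The proof is obtained by induction on the structure of $e$. We only show the case $e = p$, since
the other cases are similar to those in the proof of Proposition 1. If $\alpha \notin At^\Gamma$, then $D_{\alpha p}(GS^\Gamma(p)) = \emptyset$,
since no guarded string in $GS^\Gamma(p)$ starts with $\alpha$. Also, $\Delta^\Gamma_{\alpha p}(p) = \emptyset$, and so $GS^\Gamma(\Delta^\Gamma_{\alpha p}(p)) = \emptyset$. Otherwise, if $\alpha \in At^\Gamma$, then $GS^\Gamma(p) = \{ \alpha p \beta \mid
\alpha, \beta \in At^\Gamma \wedge \forall_{b p \overline{b'} = 0 \in \Gamma}\, (\alpha \leq b \to \beta \leq b') \}$, thus $D_{\alpha p}(GS^\Gamma(p)) = \{ \beta \in At^\Gamma \mid \beta \leq b' \text{ for all } b p \overline{b'} = 0 \in
\Gamma \text{ such that } \alpha \leq b \}$. On the other hand, $\Delta^\Gamma_{\alpha p}(p) = \{ c \}$, where $c = \prod_{b p \overline{b'} = 0 \in \Gamma,\ \alpha \leq b} b'$. Thus, $GS^\Gamma(\Delta^\Gamma_{\alpha p}(p)) =
GS^\Gamma(c)$, and we conclude that $GS^\Gamma(c) = \{ \beta \in At^\Gamma \mid \beta \leq b' \text{ for all } b p \overline{b'} = 0 \in
\Gamma \text{ such that } \alpha \leq b \}$. □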

6.2 Testing Equivalence Modulo a Set of Assumptions 

The decision procedure for testing equivalence presented before can be easily adapted. Given a set of
assumptions $\Gamma$, the set $At^\Gamma$ is obtained by removing from $At$ every atom that satisfies $c$ but does not satisfy $c'$, for
some $c \leq c' \in \Gamma$. The function f has to account for the new definition of $\Delta^\Gamma_{\alpha p}$.

We compared this new algorithm, $\mathsf{equiv}^\Gamma$, with equiv when deciding the PCA presented in
Subsection 5.2. First, we constructed expressions $r$ and $u$ from $\Gamma$, as described above, and proved the
equivalence of expressions $t_0 p_1 t_1 p_2 t_2 (t_3 t_2 p_3 t_4 p_4)^* \bar{t_3}\, \bar{t_5} + u r u$ and $0 + u r u$, with function equiv. In this
case $|H| = 17$. In other words, equiv needed to derive 17 pairs of expressions in order to reach a
conclusion about the correctness of program $P$. Then, we applied function $\mathsf{equiv}^\Gamma$ directly to the pair
$(t_0 p_1 t_1 p_2 t_2 (t_3 t_2 p_3 t_4 p_4)^* \bar{t_3}\, \bar{t_5},\ 0)$ and $\Gamma$. In this case, $|H| = 5$. Other tests that we ran produced similar
results, but at this point we have not carried out a study thorough enough to compare both methods.

7 Conclusion 



Considering the algebraic properties of KAT expressions (or even KA expressions), it seems possible to
improve the decision procedure for equivalence. The procedure essentially computes a bisimulation (or
fails to do so if the expressions are inequivalent); thus it would be interesting to know whether, for instance,
the maximum bisimulation can be obtained. Having a method that reduces the number of atoms used, or
alternatively resorting to an external SAT solver, would also make the use of KAT expressions in formal
verification more feasible. Concerning Hoare logic, it would be interesting to treat the assignment rule
within a decidable first-order theory and to integrate the KAT decision procedure in an SMT solver.